A-Lister: a tool for analysis of differentially expressed omics entities across multiple pairwise comparisons.
Background: Researchers commonly analyze lists of differentially expressed entities (DEEs), such as differentially expressed genes (DEGs), differentially expressed proteins (DEPs), and differentially methylated positions/regions (DMPs/DMRs), across multiple pairwise comparisons. Large biological studies can involve multiple conditions, tissues, and timepoints that result in dozens of pairwise comparisons. Manually filtering and comparing lists of DEEs across multiple pairwise comparisons, typically done by writing custom code, is a cumbersome task that can be streamlined and standardized.
Results: A-Lister is a lightweight command line and graphical user interface tool written in Python. It can be executed in a differential expression mode or generic name list mode. In differential expression mode, A-Lister accepts as input delimited text files that are output by differential expression tools such as DESeq2, edgeR, Cuffdiff, and limma. To allow for the most flexibility in input ID types, to avoid database installation requirements, and to allow for secure offline use, A-Lister does not validate or impose restrictions on entity ID names. Users can specify thresholds to filter the input file(s) by column(s) such as p-value, q-value, and fold change. Additionally, users can filter the pairwise comparisons within the input files by fold change direction (sign). Queries composed of intersection, fuzzy intersection, difference, and union set operations can also be performed on any number of pairwise comparisons. Thus, the user can filter and compare any number of pairwise comparisons within a single A-Lister differential expression command. In generic name list mode, A-Lister accepts delimited text files containing lists of names as input. Queries composed of intersection, fuzzy intersection, difference, and union set operations can then be performed across these lists of names.
Conclusions: A-Lister is a flexible tool that enables the user to rapidly narrow down large lists of DEEs to a small number of most significant entities. These entities can then be further analyzed using visualization, pathway analysis, and other bioinformatics tools.
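The filtering and set queries described in this abstract can be illustrated with a minimal sketch. This is not A-Lister's code or command-line interface: the function names, thresholds, and toy rows below are hypothetical, and "fuzzy intersection" is interpreted here, as an assumption, to mean membership in at least a given number of the lists.

```python
from collections import Counter


def filter_dees(rows, max_q=0.05, min_abs_lfc=1.0, direction=None):
    """Return the set of entity IDs that pass the thresholds.

    rows: iterable of (entity_id, log2_fold_change, q_value) tuples.
    direction: None (either sign), "up" (positive), or "down" (negative).
    All names and defaults are illustrative, not A-Lister options.
    """
    kept = set()
    for entity_id, lfc, q_value in rows:
        if q_value > max_q or abs(lfc) < min_abs_lfc:
            continue
        if direction == "up" and lfc <= 0:
            continue
        if direction == "down" and lfc >= 0:
            continue
        kept.add(entity_id)
    return kept


def fuzzy_intersection(sets, min_count):
    """Entities present in at least min_count of the sets (assumed meaning)."""
    counts = Counter(entity for s in sets for entity in s)
    return {entity for entity, n in counts.items() if n >= min_count}


# Made-up rows for three pairwise comparisons.
comparison_a = [("geneA", 2.1, 0.001), ("geneB", -1.5, 0.02), ("geneC", 0.3, 0.01)]
comparison_b = [("geneA", 1.8, 0.004), ("geneB", 1.2, 0.03), ("geneD", 2.5, 0.2)]
comparison_c = [("geneA", -2.0, 0.01), ("geneD", 3.0, 0.001)]

a = filter_dees(comparison_a)
b = filter_dees(comparison_b)
c = filter_dees(comparison_c)

shared_by_all = a & b & c                          # intersection
in_at_least_two = fuzzy_intersection([a, b, c], 2) # fuzzy intersection
only_in_a = a - c                                  # difference
in_any = a | b | c                                 # union
up_in_a = filter_dees(comparison_a, direction="up")
```

Because the entity IDs are treated as opaque strings, the same set operations apply unchanged to plain lists of names, which mirrors the generic name list mode described above.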
Evolving Simple Models of Diverse Intrinsic Dynamics in Hippocampal Neuron Types
The diversity of intrinsic dynamics observed in neurons may enhance the computations implemented in the circuit by enriching network-level emergent properties such as synchronization and phase locking. Large-scale spiking network models of entire brain regions offer a platform to test theories of neural computation and cognitive function, providing useful insights on information processing in the nervous system. However, a systematic in-depth investigation requires network simulations to capture the biological intrinsic diversity of individual neurons at a sufficient level of accuracy. The computationally efficient Izhikevich model can reproduce a wide range of neuronal behaviors qualitatively. Previous studies using optimization techniques, however, were less successful in quantitatively matching experimentally recorded voltage traces. In this article, we present an automated pipeline based on evolutionary algorithms to quantitatively reproduce features of various classes of neuronal spike patterns using the Izhikevich model. Employing experimental data from Hippocampome.org, a comprehensive knowledgebase of neuron types in the rodent hippocampus, we demonstrate that our approach reliably fits Izhikevich models to nine distinct classes of experimentally recorded spike patterns, including delayed spiking, spiking with adaptation, stuttering, and bursting. Importantly, by leveraging the parameter-exploration capabilities of evolutionary algorithms, and by representing qualitative spike pattern class definitions in the error landscape, our approach creates several suitable models for each neuron type, exhibiting appropriate feature variabilities among neurons. Moreover, we demonstrate the flexibility of our methodology by creating multi-compartment Izhikevich models for each neuron type in addition to single-point versions. Although the results presented here focus on hippocampal neuron types, the same strategy is broadly applicable to any neural system.
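For readers unfamiliar with the Izhikevich model, the following is a minimal sketch of a single-point simulation. It assumes the classic four-parameter form of the model with forward-Euler integration and the widely quoted "regular spiking" parameter set; the abstract does not state which model variant, integrator, or parameters the authors used, and the evolutionary fitting step is not shown.

```python
def simulate_izhikevich(a, b, c, d, current, duration_ms=1000.0, dt=0.5):
    """Simulate one Izhikevich neuron and return its spike times in ms.

    Classic four-parameter form (an assumption, not taken from the abstract):
        dv/dt = 0.04*v^2 + 5*v + 140 - u + I
        du/dt = a*(b*v - u)
        if v >= 30 mV: v <- c, u <- u + d
    """
    v = c          # membrane potential (mV), started at the reset value
    u = b * v      # recovery variable
    spike_times = []
    for step in range(int(duration_ms / dt)):
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + current
        du = a * (b * v - u)
        v += dt * dv
        u += dt * du
        if v >= 30.0:  # spike cutoff reached: record the time and reset
            spike_times.append(step * dt)
            v = c
            u += d
    return spike_times


def interspike_intervals(spike_times):
    """Differences between consecutive spike times, a simple spike feature."""
    return [t1 - t0 for t0, t1 in zip(spike_times, spike_times[1:])]


# "Regular spiking" parameters commonly quoted for this form of the model.
regular_spiking = dict(a=0.02, b=0.2, c=-65.0, d=8.0)

spikes_driven = simulate_izhikevich(current=10.0, **regular_spiking)
spikes_silent = simulate_izhikevich(current=0.0, **regular_spiking)
intervals = interspike_intervals(spikes_driven)
```

A fitting procedure of the kind described above would compare features such as these interspike intervals against recorded values and adjust the parameters to reduce the mismatch.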
Differentiating between liver diseases by applying multiclass machine learning approaches to transcriptomics of liver tissue or blood-based samples.
Background & aims: Liver disease carries a significant healthcare burden and frequently requires a combination of blood tests, imaging, and invasive liver biopsy to diagnose. Distinguishing between inflammatory liver diseases, which may have similar clinical presentations, is particularly challenging. In this study, we implemented a machine learning pipeline for the identification of diagnostic gene expression biomarkers across several alcohol-associated and non-alcohol-associated liver diseases, using either liver tissue or blood-based samples.
Methods: We collected peripheral blood mononuclear cells (PBMCs) and liver tissue samples from participants with alcohol-associated hepatitis (AH), alcohol-associated cirrhosis (AC), non-alcohol-associated fatty liver disease, chronic HCV infection, and healthy controls. We performed RNA sequencing (RNA-seq) on 137 PBMC samples and 67 liver tissue samples. Using gene expression data, we implemented a machine learning feature selection and classification pipeline to identify diagnostic biomarkers which distinguish between the liver disease groups. The liver tissue results were validated using a public independent RNA-seq dataset. The biomarkers were computationally validated for biological relevance using pathway analysis tools.
Results: Utilizing liver tissue RNA-seq data, we distinguished between AH, AC, and healthy conditions with overall accuracies of 90% in our dataset, and 82% in the independent dataset, with 33 genes. Distinguishing 4 liver conditions and healthy controls yielded 91% overall accuracy in our liver tissue dataset with 39 genes, and 75% overall accuracy in our PBMC dataset with 75 genes.
Conclusions: Our machine learning pipeline was effective at identifying a small set of diagnostic gene biomarkers and classifying several liver diseases using RNA-seq data from liver tissue and PBMCs. The methodologies implemented and genes identified in this study may facilitate future efforts toward a liquid biopsy diagnostic for liver diseases.
Lay summary: Distinguishing between inflammatory liver diseases without multiple tests can be challenging due to their clinically similar characteristics. To lay the groundwork for the development of a non-invasive blood-based diagnostic across a range of liver diseases, we compared samples from participants with alcohol-associated hepatitis, alcohol-associated cirrhosis, chronic hepatitis C infection, and non-alcohol-associated fatty liver disease. We used a machine learning computational approach to demonstrate that gene expression data generated from either liver tissue or blood samples can be used to discover a small set of gene biomarkers for effective diagnosis of these liver diseases.
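The general shape of a feature selection and classification pipeline can be sketched as follows. The abstract does not name the scoring statistic or the classifier, so both are stand-ins chosen for brevity: genes are ranked by a ratio of between-class to within-class variability, and samples are assigned to the nearest class centroid. The gene names and expression values are made up.

```python
from statistics import mean


def separation_score(values, labels):
    """Between-class over within-class variability for one gene (stand-in score)."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    grand_mean = mean(values)
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    return between / (within + 1e-12)  # small constant avoids division by zero


def select_top_genes(expression, labels, k):
    """Rank genes by separation score and keep the k best."""
    scores = {gene: separation_score(vals, labels) for gene, vals in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


def nearest_centroid_predict(expression, labels, sample, genes):
    """Assign a sample to the class whose mean profile over `genes` is closest."""
    centroids = {}
    for label in set(labels):
        idx = [i for i, lab in enumerate(labels) if lab == label]
        centroids[label] = [mean([expression[g][i] for i in idx]) for g in genes]

    def squared_distance(centroid):
        return sum((sample[g] - c) ** 2 for g, c in zip(genes, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Toy data: three samples each of AH, AC, and healthy control (CT).
labels = ["AH"] * 3 + ["AC"] * 3 + ["CT"] * 3
expression = {
    "gene1": [1.0, 1.2, 0.9, 5.0, 5.1, 4.8, 9.0, 9.2, 8.9],    # separates all groups
    "gene2": [3, 2, 4, 3, 2, 4, 3, 2, 4],                      # uninformative
    "gene3": [0.1, 0.2, 0.15, 0.1, 0.2, 0.15, 3.0, 3.1, 2.9],  # separates CT only
}

selected = select_top_genes(expression, labels, k=2)
new_sample = {"gene1": 8.7, "gene2": 3.0, "gene3": 2.8}
predicted = nearest_centroid_predict(expression, labels, new_sample, selected)
```

In practice the selection step must be fitted on training samples only and evaluated on held-out data, which is the role the independent validation dataset plays in the study.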
DataSheet1.DOCX
Confusion matrices corresponding to the best gene and protein sets of the full datasets and the liver tissue validation datasets.
The Liver 3-Way Full best gene and protein sets contained 33 genes and 27 proteins, respectively. The PBMC 3-Way Full best gene and protein sets contained 16 genes and 28 proteins, respectively. (A) Confusion matrix for classification of Liver 3-Way Full RNAseq dataset using best gene set identified by filter feature selection. The diagonal contains the number and percentage of the correctly predicted samples. (B) Confusion matrix for classification of AH, AC, and healthy control (CT) samples within independent validation RNAseq dataset. (C) Confusion matrix for classification of PBMC 3-Way Full RNAseq dataset using best gene set identified by filter feature selection. (D) Confusion matrix for classification of Liver 3-Way Full proteomic dataset using best protein set identified by filter feature selection. (E) Confusion matrix for classification of AH, AC, and CT samples within independent validation proteomic dataset. (F) Confusion matrix for classification of PBMC 3-Way Full proteomic dataset using best protein set identified by filter feature selection.
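As a reading aid for these panels, the sketch below shows how a confusion matrix and the overall accuracy on its diagonal are computed. The labels are toy values invented for illustration and are not the study's predictions; the function names are hypothetical.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[index[true]][index[predicted]] += 1
    return matrix


def overall_accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly predicted."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


classes = ["AH", "AC", "CT"]
# Toy labels only; not data from the study.
true_labels = ["AH", "AH", "AH", "AC", "AC", "CT", "CT", "CT", "CT", "AC"]
predicted_labels = ["AH", "AH", "AC", "AC", "AC", "CT", "CT", "CT", "AH", "AC"]

matrix = confusion_matrix(true_labels, predicted_labels, classes)
accuracy = overall_accuracy(matrix)
```

Each off-diagonal cell counts one kind of misclassification, for example true AH samples predicted as AC.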
Study population demographics (PBMCs) for proteomic and RNAseq analysis.
Best genes and proteins for each dataset.
For the integrated datasets, the matching genes and proteins are bolded.
Confusion matrices corresponding to the best gene and protein sets in the matched balanced data set tested separately, and tested with the integrated gene/protein set.
Confusion matrices corresponding to the best gene and protein sets (59 genes and 19 proteins, respectively) evaluated within Liver 3-Way Matched Balanced data and within PBMC 3-Way Matched Balanced data (16 genes and 33 proteins, respectively). (A) Confusion matrix for classification of Liver 3-Way Matched Balanced RNAseq dataset using best gene set identified by filter feature selection. (B) Confusion matrix for classification of Liver 3-Way Matched Balanced proteomic dataset using best protein set identified by filter feature selection. (C) Confusion matrix for classification of Liver 3-Way Matched Balanced dataset using a combination of best gene and protein sets. (D) Confusion matrix for classification of PBMC 3-Way Matched Balanced RNAseq dataset using best gene set identified by filter feature selection. (E) Confusion matrix for classification of PBMC 3-Way Matched Balanced proteomic dataset using best protein set identified by filter feature selection. (F) Confusion matrix for classification of PBMC 3-Way Matched Balanced dataset using a combination of best gene and protein sets.
Supplemental methods and supplemental results for this study.